Word Embedding Evaluation and Combination
Authors
Abstract
Word embeddings have been used successfully in several natural language processing (NLP) and speech processing tasks. Different approaches have been introduced to compute word embeddings with neural networks. Many studies in the literature have focused on word embedding evaluation, but to our knowledge some gaps remain. This paper presents a rigorous comparison of the performance of different kinds of word embeddings. Performance is evaluated on different NLP and linguistic tasks, with all word embeddings estimated on the same training data using the same vocabulary, the same number of dimensions, and other similar characteristics. The evaluation results reported in this paper match those in the literature, in that the improvements achieved by a word embedding in one task are not consistently observed across all tasks. For that reason, this paper investigates and evaluates approaches to combining word embeddings in order to take advantage of their complementarity and to find effective word embeddings that achieve good performance on all tasks. In conclusion, this paper provides new insights into the intrinsic qualities of the well-known word embedding families, which can differ from those reported in previously published work.
Similar resources
Perform Three Data Mining Tasks with Crowdsourcing Process
In data mining studies, performing feature selection by hand is complex, so some of the labeling must be sent to workers through crowdsourcing. The outsourcing of data mining tasks to users is often handled by software systems that lack sufficient knowledge of the users' age or place of residence. Uncertainty about the performance of virtual user...
Combination of Adaptive-Grid Embedding and Redistribution Methods on Semi Structured Grids for two-dimensional inviscid flows
Among the adaptive-grid methods, redistribution and embedding techniques have received more attention from researchers. Simultaneous or combined adaptive techniques have also been used. This paper describes a combination of adaptive-grid embedding and redistribution methods on semi-structured grids for two-dimensional inviscid flows. Since the grid is semi-structured, it is possible to us...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
Evaluation of the effects of autologous adipose derived mesenchymal stem cells in combination with polyacrylamide hydrogel and nanohydroxyapatite scaffolds on healing in rabbit critical-sized radial bone defect model
Objective: In this study, the bone regeneration ability of polyacrylamide hydrogel and nanohydroxyapatite scaffolds (PAAH/NHA) and stem cells derived from adipose tissue (ADSCs) in the healing of critical-sized bone defects in the rabbit radius was assessed. Animals and procedures: 12 New Zealand white male rabbits were divided into 3 groups. The rabbits were anesthetized and 15 mm bone def...